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Resume 

Biochemical reaction networks are subjected to large fluctuations 
due to small molecule numbers, yet underlie reliable biological func- 
tions. Most theoretical approaches describe them as purely determinis- 
tic or stochastic dynamical systems, depending on which point of view 
is favored. Here, we investigate the dynamics of a self-repressing gene 
using an intermediate approach based on a moment expansion of the 
master equation, taking into account the binary character of gene ac- 
tivity. We thereby obtain deterministic equations which describe how 
nonlinearity feeds back fluctuations into the mean-field equations, pro- 
viding insight into the interplay of determinism and stochasticity. This 
allows us to identify a region of parameter space where fluctuations 
induce relatively regular oscillations. 

The simplest gene regulatory network is formed by a single gene which 
is regulated by its own protein. In the case of negative feedback (the gene 
is repressed by the protein), it serves as a paradigmatic genetic oscillator, 
which has for example been proposed as a model for the somite clock pQ. 
Accordingly, its dynamics has been actively investigated throughout mathe- 
matical biology [2H9] • The self-repressing gene reaction network involves four 
chemical species : the free gene G, RNA M, protein P and the DNA-protein 
complex GP. These molecular actors interact via the following biochemical 
reactions : 

G + P ^ GP ; G^G + M 

9 

GP A GP + M; M ?U M + P M 
M ^ : P i£+ 
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Parameter a (resp., 8) is the DNA-protein binding (resp., unbinding) rate, 
S m (resp., dp) is the linear mRNA (resp., protein) degradation rate, A (fi) is 
the transcription rate of unbound (resp., bound) gene and f3 is the transla- 
tion rate. 

Besides its biological significance, the self-regulating gene is interesting 
from a mathematical viewpoint because it combines two qualitatively dif- 
ferent types of variables. On the one hand, proteins and RNAs may be 
present in high copy numbers, and can therefore be described by macrosco- 
pic variables in the large volume limit. On the other hand, the gene, which 
is a DNA fragment, is a single molecule. We assume that it can only be 
in two distinct states, bound or unbound, and therefore must be described 
mathematically by a binary variable. In most theoretical treatments, the 
difficulty of including this binary variable is circumvented by assuming that 
fluctuations of the gene state are fast compared to transcription, transla- 
tion, and degradation. Under this aproximation, the binary gene state gene 
is replaced by a continuous variable which is the average occupancy time. 

To explore the rich dynamics induced by fluctuations associated to gene 
state switching, a stochastic treatment is needed. A key point is that fluc- 
tuations interact with nonlinearities and modify the mean-field behavior, 
because averaging does not commute with evaluating a nonlinear function 
(the definition of variance is the simplest example of this fact). In particu- 
lar, fluctuations may destabilize a system towards an oscillatory behavior. To 
capture the role of fluctuations in a simple setting, we study a model where 
nonlinearity only occurs in the transcriptional regulation by a monomer, 
RNA and protein degradation being linear. The most general probabilistic 
description of the chemical reaction network (1) is provided via the chemical 
master equation. If Pg t m iP (t) denotes the probability to find a bound (resp., 
unbound) gene, represented by g = (resp., g = 1), accompanied by m 
ARN and p protein copies at time t, its time evolution is governed by the 
following master equation in the limit where 8 /a is large and [i is zero : 

"^Pg,m,p — ( 1)^ [(%pPg,m,p @Pg,m,p\ Sg t \\ [Pl,m— l,p Pg,m,p] 

-\-f3l7l \Pg m p—\ Pg,m,p\ ~\~ Sm [(^1 1) Pg,m+l,p Pg,m,p\ 
+5 p [(p + 1) Pg,m,p+1 ~ pPg,m,p] ■ 

(2) 

The asymptotic probability distribution satisfying ([2]) has been widely in- 
vestigated pHHH] but gives only a static picture of the dynamics, averaged 
over time. To better understand the influence of stochastic fluctuations on 
the temporal dynamics of the self-repressing gene, we reformulate the mas- 
ter equation as an infinite hierarchy of coupled differential equations whose 
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variables are the joint cumulants of random variables g, m and p. To be spe- 
cific, the first-order and second-order joint cumulants of random variables x 
and y are the averages and the covariances 

( x ) = ^2 xP 9,™,p, &x,y = (xy) - (x) (y) , 

g,m,p 

while the third-order joint cumulants are defined by 

K x , y ,z = (xyz) - (x) (y) (z) - (x) A VtZ - (y) A XtZ - (z) A x , y . 

By rescaling parameters and joint cumulants according to 

n = h r 9 = l; r m = ^ r p = £; 6 = r t 8; A = 5=^, 
T = r t t; X = r x (x) ; A xy = r x r y A X)y ; 
K X ,Y,Z = r x r y r z K x ^ y>z x,y,z £ {g, m,p}; 

(3) 

the following normalized time evolution equations for averages and cova- 
riances are obtained : 



% = 5(M-P); 
= AG - M; 



dM 
dT 



dG 
dT 

dA 



p,p 



dT 



dT 



@(1-G-GP-A g ,p); 

25 (A PM - Ap, P ) ; 

A [2A GtM + fiG] - 2A MjM ; 



%^ = AA GtP -(d + l)Ap :M + SA M:M ; 

= 5[A GM - A G , P ] - 9 [K Gi p P + GApp + (P + 1) A g ,p] 



dA 



G,M 



dT 



AG (1 - G) - A GM - G [K GM ,p + GAp M + (P + 1) A GyM ] • 

(4) 

Since G is a binary variable, A G>G = G(l — G) and is slaved by G. 
Because of the nonlinearity, Eqs. ([4]) do not form a closed system as the 
time derivatives of joint cumulants involve joint cumulants of higher order. 
In particular, Eqs. do not constrain the two third-order joint cumulants 
Kg,m,p ano - Kg,p,p which are unspecified. As is well known, the simplest 
way to truncate the hierarchy of moment equations is to set all second- 
order cumulants to zero |12j . This neglects all fluctuations and leads to 
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deterministic rate equations for the time averages. Here, these equations 
predict that only stable stationary states can be observed in the system 
studied. 

Incorporating fluctuations in the dynamics requires truncating the hie- 
rarchy at a higher order. As a first approximation, it would seem natural to 
set third-order joint cumulants to zero, which amounts to represent the sto- 
chastic variables by Gaussian probability distribution functions whose means 
and variances vary with time and interact with each other. In our case, ho- 
wever, this is not necessarily a correct approach because of the coexistence 
of macroscopic variables {g and p) interacting with a binary (microscopic) 
variable (g). In the following, we consider two limiting cases depending on 
the value of the gene response time. 

In the first case, we assume that the unbinding and binding rates are 
very large compared to other dynamical rates and keep their ratio constant 
(0 — > oo, 6 jot constant). The gene remains bound or unbound for very 
short amounts of time, during which mRNA and proteins copy numbers 
can be considered as constant. RNA and protein levels keep a memory of 
many previous state switching cycles, and reach a stationary state with 
an expected gaussian distribution. In this case the third-order cumulants 
Kg,m,p and Kg,p,p vanish so that Equations (JH) become closed. In the 
limit where the overall transcription rate is large (A 3> 1), the stationary 
state is given by 

G-.AJ^-L^^^); A^Zi( 1+2 _L_)(5 a ) 

p- = m* = A- PP = a; ;p = a;„, = Va (i + 2@| , + <s) ) (5b) 

where we include the correction to first order in _1 . An interesting finding 
is that this correction only depends on the combination 0(l + <5)/<5. A linear 
stability analysis then indicates that the stationary state is always stable, 
in agreement with the rate equation approximation. 

Conversely, let us assume that the gene reacts infinitely slowly (0 — > 0). 
The dynamics is then driven by the gene jumping between two states accor- 
ding to a Poisson process. When the gene is active (Gon = 1), protein and 
RNA levels quickly converge to high level states Mon = Pon = A. When 
the gene is inactive, protein and RNA levels are low {Moff = Poff = 0). 
At the end of an ON/OFF cycle, the system is always in the same state, with 
no memory of previous cycles. Protein temporal profiles feature a sequence 
of spikes, distributed in time according to a Poisson process. 
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In this limit case, averages, covariances and third-order joint cumulants 
can easily be derived thanks to the high correlation between variables. The 
gene is active during a time toN = 1/(0 A) and inactive during a time toFF = 
1/9 so that its average activity is G* = toN/(toN + toFF) = 1/(1 + A) and 
P* = M* = G*A. If the transcription rate is large (A » 1), the different 
variables scale according to : 

n* ~ a-1. p* — yr* — A* — A* ~ 1- 
u — iv ' a G,P — G,M — x ' 

A * — A * — A* — K'* — J5T* ~ A ^ ' 

If we further assume that 0A <C (5,1, the following reduced system un- 
couples from the other equations, regardless of whether the third-order joint 
cumulants are vanishing or not : 

f t P = 5(M-P); £ t A G , P = (5(A g ,a/-A g ,p); 

f t M = AG - M; |A G>M = AG(l-G)- A GM ; (7) 

£ t G = Q(l-G-GP-A G , P ). 

Besides the averages, this system involves the covariances of the gene state 
variable with protein and mRNA levels. The stationary state of Eqs. ([7|) 
is given by G* = A -1 , with all other variables equal to 1, thus satisfying 
scaling ([6|) exactly. The dynamical behaviour of Eqs. ([7]) can be studied 
by carrying out a stability analysis around the fixed point. It reveals that 
unlike with the mean- field equations, the system exibits a Hopf bifurcation 
leading to oscillatory behaviour, revealing that it can be destabilized by 
the stochastic fluctuations. Under the approximation A> 1, the oscillation 
criterion is simply 



H{G) = 46 2 + 9 



2 ( 1 + *)-(TT7) A 



+ 5 <0. (8) 



If the gene is infinitely slow (0 — >• 0), the criterion is never satisfied because 
T~L(0) = 5 > 0. For intermediate values of G, the quantity % can be become 
negative provided A is high enough and 5 is not too large compared to 1. 
However this can only occur when AO > 6 + 1, which a priori conflicts 
with the assumptions under which the reduced model (|7|) has been derived. 
We therefore carry out stochastic simulations of the chemical model (1) to 
assess the relevance of the two truncations of the moment equation hierarchy 
considered here : Eqs. (|7|) or Eqs. © with third-order cumulants set to zero. 

We first study how the average value of the gene activity as a function of 
is reproduced by the two truncations. Fig {U-A) shows how this average 
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depends on G and 5 due to the presence of fluctuations, an effect which is 
not captured by the mean-field equations. An important result is that the 
reduced parameter (l + <5 _1 ) which appeared in ([5]) is indeed the main 
parameter controlling the fixed point location throughout the range of Q 
explored, as curves obtained for various values of 6 and S superimpose re- 
latively well. The limit values predicted by ([6]) and ([5]) for the two extreme 
regimes 0—^0 and 6 — > oo, respectively, are indeed recovered. Fig [Q-B 
displays the average gene activity predicted from the fixed point of Eqs. 
with the third-order cumulants set to zero. The agreement with stochastic 
simulations is fairly good : the limit values in FigQjA and FigQjB are iden- 
tical, the same global shape with a maximum around = 1/(1 + 8 ) is 
observed and the evolution of the maxima with 5 is correctly reproduced. 
The main discrepancy is that the transition from the fast to the slow gene 
regime is more abrupt in Fig ([U-B) than in Fig ([TJ-A). In contrast to this, the 
fixed point of Eqs. (JT|) does not depend on nor 8, thus incorrectly predicts 
a constant gene activity (see FigOJC). At this stage, the model (jH) with 
vanishing third-order cumulants correctly captures the effect of fluctuations 
but not the simpler model ((ZJ). 
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Figure 1 - Average gene activities as a function of the reduced 
parameter (l + <^ _1 ) (A) Numerical estimation using stochastic simu- 
lations with parameter values : 6/a = 100, A = 200, A = /3. Each color 
corresponds to a given value of 5, varying from 10 2 (black) to 10 (cyan) (B) 
Fixed point value of gene activity in model ([!]) with vanishing third-order 
joint cumulants. (C) Fixed point value of gene activity in model ([7]). Arrows 
indicate the gene activity limiting values given by © for large and by © 
for small 0. 

Let us now consider the dynamics of protein level fluctuations. Without 
feedback, gene switching is a purely Poissonian process. Protein levels fol- 
low the gene state with a characteristic time scale determined by protein 
and mRNA lifetimes. With feedback, the probability of switching evolves 
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rapidly in time as protein levels increase (gene is active) or decrease (gene 
is inactive). This feedback may reduce the stochasticity arising from gene 
switching, with protein peaks occur more regularly. 

The regularity of a stochastic oscillatory behavior is often quantified 
using a temporal autocorrelation function, which is sensitive to reproduci- 
bility both in time and in amplitude. However, temporal regularity is cer- 
tainly more relevant than amplitude regularity for biological protein signals. 
The highly nonlinear response of many signaling cascades can protect them 
against fluctuations in amplitude, for example by saturating output above 
an input threshold. A standard technique for assessing temporal regularity 
is to divide the state space into two regions I and II and to study the distri- 
bution of the times where the system leaves I to enter II. It is often useful to 
require a minimal excursion in region II to avoid spurious transitions induced 
by noise. Here, we detect events where the protein level crosses successively 
the mean protein level P* and the P * = P* + 0.25 y^pp level before falling 
back below the mean protein level. 

Given the list of times where the system transits from low to high protein 
levels, we compute the probability of detecting n transitions within a time 
interval of fixed duration. The probability distribution is then characterized 
by its variance to mean ratio (Fano factor). This method is inspired by how 
the temporal distribution of photons from a light source is generally charac- 
terized, with the event of interest being a photon detection. A Fano factor 
close to unity indicates that transition times follow a Poison distribution. A 
Fano factor greater (less) than unity indicates a super- Poissonian (resp., sub- 
Poissonian) distribution corresponding to a bunching (resp., anti-bunching) 
of transition events. Transition anti-bunching can be viewed as a stochastic 
counterpart of deterministic oscillations. 

Fig [2] shows stochastic simulations of the chemical reaction network (1) 
for a slow, an intermediate and a fast gene, as well as the probability dis- 
tribution of the number n of transitions within a given time window. As 
expected, protein spikes in the slow gene case (Fig [2]-A) are slaved to the 
switching process, leading to a Poisson probability distribution for n (|2}-D) 
and accordingly a unity Fano factor. In the intermediate gene response time 
case (Fig[2]-B), protein spikes are mostly antibunched (see red circles). The 
probability distribution of spike number is gaussian-like (J2J-E), the Fano fac- 
tor being around 0.35. This anti-bunching degrades in the case of a fast gene 
(Fig 0-C) with the Fano factor rising to 0.5. Thus, we oberve a resonance 
effect involving the time in which the gene responds to protein variations 
and the time during which previous gene states are remembered, which is 
controlled by the protein and RNA decay rates. 
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Figure 2 - Dynamical behaviour. Time evolution of protein copy number 
for A = 200, 5 = 1 and 9 = 5.10~ 3 , 0.5, 500 (A,B,C). The dashed lines 
indicate mean protein level and mean protein level plus variance. Red lines 
correspond to the high trigger level and spiking events are indicated by red 
circles. (D, E, F) Probabilities of observing n spikes during a given time 
window for each of the three regimes. 



We have studied systematically how the Fano Factor depends on the gene 
unbinding rate and the relative protein decay rate 5 in stochastic simula- 
tions of reaction network (1). As Fig ([3]) shows, regularity of protein spikes 
is enforced when (1) the decay rates S p and 5 m are comparable (5 ~ 1) and 
(2) the reduced parameter ©*(l + <5~ 1 ) is close to unity. Quite remarkably, 
the parameter space region where protein spikes are more regularly spaced is 
extremely well approached with the region where the reduced model ([7]) dis- 
plays deterministic oscillations. This suggests that this model captures well 
the dynamical interaction of mean-field variables and fluctuations, although 
it did not reproduce satisfactorily the average gene activities in Fig. [2l This 
probably indicates that the dynamically important joint cumulants are those 
involving the gene state variable. This is not surprising given that gene state 
remains binary in all limits and is thus the most stochastic variable. 

In conclusion, we have studied the stochastic time evolution of the self- 
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Figure 3 - Fano factor. Dependence of the Fano factor F quantifying 
spiking regularity on 5 and O * + Stochastic simulations of net- 

work (1) have been carried out with A = 200 ; /3 = A ; 9/a = 100. Different 
values of the Fano factor F are indicated by red hexagons (F < 0.35), blue 
pentagons (0.35 < F < 0.4), green diamonds (0.4 < F < 0.45), cyan tri- 
angles (0.45 < F < 0.5), magenta stars (0.5 < F < 0.7), and orange crosses 
(0.7 < F). The black line encloses the region where the reduced model (|7|) 
oscillates. 

repressing gene and characterized the regularity of protein spikes using a 
Fano-like indicator. This allowed us to evidence a dynamical resonance phe- 
nomenon where a more regular time evolution of protein concentration is 
observed for certain values of the relative protein degradation rate and of 
the gene response time. Average quantities, on one hand, and the location 
of the resonance in parameter space, on the other hand, are reproduced se- 
parately by two reduced deterministic models obtained from a truncation of 
the moment equations hierarchy. It remains to combine these two models to 
reach a global description of how averages and fluctuations interact through 
nonlinearity in the self-repressing gene circuit. 
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